1. INTRODUCTION

A family of lossless compression methods, allowing exact image reconstruction, 
is evaluated for compressing AVIRIS image data. The methods are based on Differential 
Pulse Code Modulation (DPCM). The compressed data have an entropy of order 6 
bits/pixel. A theoretical model indicates that significantly better lossless compression is 
unlikely to be achieved because of limits caused by the noise in the AVIRIS channels. 

AVIRIS data differ from data produced by other visible/near-infrared sensors, such 
as Landsat-TM or SPOT, in several ways. Firstly, the data are recorded at a greater 
resolution (12 bits, though packed into 16-bit words). Secondly, the spectral channels are 
relatively narrow and provide continuous coverage of the spectrum, so that the data in 
adjacent channels are generally highly correlated. Thirdly, the noise characteristics of the 
AVIRIS are defined by the channels’ Noise Equivalent Radiances (NERs), and these NERs 
show that, at some wavelengths, the least significant 5 or 6 bits of data are essentially 
noise. 


2. COMPRESSION SCHEME 

The overall scheme adopted for lossless compression comprises three main 
elements: 

(1) prediction of the current pixel's value from prior pixels' values; 

(2) differencing to form a residual; 

(3) encoding the residual using a variable or fixed rate code. 

The residuals are represented using NBIT bits. Any residual outside the range 
$-(2^{\mathrm{NBIT}-1}-1)$ to $+(2^{\mathrm{NBIT}-1}-1)$ is an exceedance. For variable rate coding, the residuals 
falling within this range are Huffman-encoded. The resulting codebook is optimal for 
each data set. An exceedance is indicated by the value $-2^{\mathrm{NBIT}-1}$, and its value is 
transmitted in full (16 bits). 
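
The exceedance rule can be illustrated with a minimal Python sketch. The function names are hypothetical, and the zero-order entropy of the symbols is used here as a stand-in for the Huffman code length (it is a lower bound on that length), so the rate it reports is indicative only.

```python
import numpy as np

def split_residuals(residuals, nbit):
    """Map residuals to NBIT-bit symbols, replacing exceedances by the flag value."""
    limit = 2 ** (nbit - 1) - 1        # largest in-range magnitude
    flag = -(2 ** (nbit - 1))          # reserved value that marks an exceedance
    residuals = np.asarray(residuals, dtype=np.int64)
    exceed = np.abs(residuals) > limit
    symbols = np.where(exceed, flag, residuals)
    return symbols, exceed

def entropy_bits(symbols):
    """Zero-order entropy (bits/symbol) of a sequence of integer symbols."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def variable_rate_estimate(residuals, nbit, raw_bits=16):
    """Approximate bits/pixel: symbol entropy plus raw_bits per exceedance."""
    symbols, exceed = split_residuals(residuals, nbit)
    return entropy_bits(symbols) + raw_bits * exceed.mean()
```

With NBIT=8, for instance, a residual of 200 is replaced by the flag value -128 and costs an extra 16 bits for the value sent in full.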

For the methods using optimised predictors, there is an overhead caused by the 
need to transmit prediction coefficients, and this is set at 32 bits per coefficient. This 
overhead is significant. 
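
To give a feel for the size of this overhead, the short Python sketch below converts it to bits per pixel. It assumes that one set of coefficients is transmitted for every line of every channel, and it uses a line length of 614 pixels; both are assumptions made for illustration and are not figures taken from the text above.

```python
def coefficient_overhead(n_coefficients, pixels_per_line, bits_per_coefficient=32):
    """Overhead in bits/pixel when one coefficient set is sent per image line."""
    return n_coefficients * bits_per_coefficient / pixels_per_line

# Assumed line length of 614 pixels (hypothetical value for illustration).
for n in (1, 2, 3, 4):
    print(n, "coefficient(s):", round(coefficient_overhead(n, 614), 3), "bits/pixel")
```

Under these assumptions the overhead ranges from roughly 0.05 bit/pixel for one coefficient to roughly 0.21 bit/pixel for four.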

3. PREDICTION SCHEMES 

Fourteen prediction schemes have been evaluated. Let $x_{i,j,\lambda}$ represent the value of the 
pixel in row (line) $i$, column $j$, channel (band) $\lambda$, and let $\hat{x}_{i,j,\lambda}$ be its predicted value. 
Residuals are formed according to the expression: 

residual = $x_{i,j,\lambda}$ - nearest integer to ($\hat{x}_{i,j,\lambda}$). 

For schemes using optimised coefficients, the coefficients (variously a, b, c, or d) are 
calculated by the least-squares minimisation of $\sum_j (\hat{x}_{i,j,\lambda} - x_{i,j,\lambda})^2$, where the summation 
is taken along a line ($j$ varies, $i$ and $\lambda$ fixed). 
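
The per-line least-squares fit can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used: the function names are hypothetical, NumPy's general-purpose solver stands in for whatever fitting routine was employed, and ties in the "nearest integer" step are resolved by `np.rint` (round half to even), which the text does not specify.

```python
import numpy as np

def fit_line_predictor(target, regressors):
    """Least-squares fit of target ~ a + sum_k coef_k * regressors[k] along one line."""
    target = np.asarray(target, dtype=float)
    columns = [np.ones_like(target)] + [np.asarray(r, dtype=float) for r in regressors]
    design = np.column_stack(columns)            # first column carries the offset a
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs, design

def residuals_for_line(target, regressors):
    """Integer residuals: actual value minus the rounded prediction."""
    coeffs, design = fit_line_predictor(target, regressors)
    predicted = np.rint(design @ coeffs).astype(np.int64)
    return np.asarray(target, dtype=np.int64) - predicted, coeffs
```

For a one-point channel predictor, `target` would be one line of channel $\lambda$ and `regressors` would hold the same line of channel $\lambda-1$. A practical coder would also need to predict with the coefficients exactly as transmitted, so that the decoder reproduces the same rounded predictions.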


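Spatial Methods. Fixed Coefficients 
Row:                               [formula not recoverable] 
Column:                            $\hat{x}_{i,j,\lambda} = x_{i-1,j,\lambda}$ 
Two-point Row-Column:              [formula not recoverable] 
Three-point Row-Column:            $\hat{x}_{i,j,\lambda} = (3x_{i-1,j,\lambda} + 3x_{i,j-1,\lambda} - 2x_{i-1,j-1,\lambda})/4$ 

Spatial Methods. Optimised Coefficients 
Optimised Row:                     $\hat{x}_{i,j,\lambda} = a + b\,x_{i,j-1,\lambda}$ 
Optimised Column:                  $\hat{x}_{i,j,\lambda} = a + b\,x_{i-1,j,\lambda}$ 
Optimised Two-point Row:           $\hat{x}_{i,j,\lambda} = a + b\,x_{i,j-1,\lambda} + c\,x_{i,j-2,\lambda}$ 
Optimised Two-point Row/Column:    $\hat{x}_{i,j,\lambda} = a + b\,x_{i,j-1,\lambda} + c\,x_{i-1,j,\lambda}$ 

Spectral Method. Fixed Coefficients 
Channel:                           $\hat{x}_{i,j,\lambda} = x_{i,j,\lambda-1}$ 

Spectral Methods. Optimised Coefficients 
Mean-corrected:                    $\hat{x}_{i,j,\lambda} = a + x_{i,j,\lambda-1}$ 
One-point channel:                 $\hat{x}_{i,j,\lambda} = a + b\,x_{i,j,\lambda-1}$ 
Two-point channel:                 $\hat{x}_{i,j,\lambda} = a + b\,x_{i,j,\lambda-1} + c\,x_{i,j,\lambda-2}$ 
Three-point channel:               $\hat{x}_{i,j,\lambda} = a + b\,x_{i,j,\lambda-1} + c\,x_{i,j,\lambda-2} + d\,x_{i,j,\lambda-3}$ 

Spectral-Spatial Method. Optimised Coefficients 
Channel-Row:                       $\hat{x}_{i,j,\lambda} = a + b\,x_{i,j,\lambda-1} + c\,x_{\ldots}$ [final index not recoverable] 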
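
As an illustration of a fixed-coefficient spatial scheme, the Python sketch below applies the three-point row-column predictor, $(3x_{i-1,j,\lambda} + 3x_{i,j-1,\lambda} - 2x_{i-1,j-1,\lambda})/4$, to a single channel held as a 2-D array. The function name is hypothetical, and the first row and first column are simply skipped here, because how the boundary pixels are handled is not stated.

```python
import numpy as np

def three_point_row_column_residuals(band):
    """Residuals of the fixed three-point row-column predictor for one channel.

    band is a 2-D integer array indexed [row i, column j]; residuals are
    returned for pixels with i >= 1 and j >= 1 only.
    """
    band = np.asarray(band, dtype=np.int64)
    up = band[:-1, 1:]       # x[i-1, j]
    left = band[1:, :-1]     # x[i, j-1]
    diag = band[:-1, :-1]    # x[i-1, j-1]
    predicted = (3 * up + 3 * left - 2 * diag) / 4.0
    return band[1:, 1:] - np.rint(predicted).astype(np.int64)
```

Because the three weights sum to one, a uniform image yields zero residuals everywhere.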


4. TEST DATA SETS 


The schemes have been evaluated using 3 data sets: the complete radiometrically 
rectified data set for a Jasper Ridge image (Run 05, 07/23/90), and the first six and the 
last six lines of a Moffett Field image (Run 013, 07/23/90). All 224 channels were used. 
Some values of the 16-bit pixels fall outside the nominal 12-bit range. Negative values 
are thought to be caused by radiometric rectification, and values above 4095, by noise. 
The entropies of the three data sets are 9.82, 9.20 and 9.85 bits/pixel, respectively. 
Straight application of a UNIX-like compress algorithm to the two Moffett Field data sets 
yields compressed files of 10.73 and 11.53 bits/pixel, respectively. 


5. RESULTS 


Number of bits per residual: The variation of the compressed image 
entropy as NBIT (see §2) varies from 13 down to 3 bits has been studied, for both 
variable and fixed rate coding. As NBIT decreases, the number of exceedances increases, 
and the compression worsens for variable rate coding. Results below are for 8-bit 
residuals, which entail losses mostly in the range 0.1-0.25 bit/pixel compared with 13-bit 
residuals. The pattern of loss is similar for all the methods. Optimal values of NBIT are 
found for fixed rate coding. 

Spatial Methods: Of the 8 methods using spatial prediction, the one 
named "Two-point Row-Column" provided the best performance (6.88, 6.46, 7.10 
bits/pixel respectively). To indicate the spread of performance, the worst method for each 
data set produced 7.42, 6.84 and 7.57 bits/pixel, respectively. The optimised methods 
produced residuals with lower standard deviations but any reduction in the residuals' 
entropy was negated by the coefficient overhead. 

Spectral Methods: Of the 6 methods using spectral prediction, that called 
"Two-point Channel" was best (5.90, 5.81, 5.89 bits/pixel). Residuals coded with 
NBIT=13 improve the compression by no more than 0.10 bit/pixel. The "One-point 
Channel" method was only marginally worse, by 0.11 bit/pixel for the worst of the three 
data sets. The compression given by the "Channel-Row" method, which uses both 
spectral and spatial data, was intermediate between these two methods. Fixed-coefficient 
Channel DPCM was the worst of all the 14 methods. The "Mean-corrected" method was 
the second worst method for one Moffett Field data set, but it performed better than all the 
spatial methods for the other two data sets. 

Fixed- vs. Variable-Rate Coding: A similar pattern of results holds for fixed-rate 
coding. For the best spatial method, "Two-point Row-Column", allocating 8 bits to 
the residuals provides the best compression overall (8.33, 8.24, 8.45 bits/pixel for the 
respective data sets). For the spectral methods, 6 bits is the optimum, giving compressed 
data of, respectively, 7.38, 7.39, 7.47 bits/pixel. Fixed-rate coding is worse than 
variable-rate coding by about 1.5 bits/pixel. 
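
The existence of an optimum NBIT for fixed-rate coding can be illustrated with a small Python sketch. It assumes a simple cost model, NBIT bits for every residual plus 16 further bits for each exceedance, and ignores any coefficient overhead; the function names and this cost model are assumptions made for illustration.

```python
import numpy as np

def fixed_rate_bits_per_pixel(residuals, nbit, raw_bits=16):
    """NBIT bits per residual plus raw_bits for every exceedance (assumed model)."""
    residuals = np.asarray(residuals)
    limit = 2 ** (nbit - 1) - 1
    exceed_fraction = np.mean(np.abs(residuals) > limit)
    return nbit + raw_bits * exceed_fraction

def best_nbit(residuals, candidates=range(3, 14)):
    """Return the NBIT with the lowest cost, together with all candidate costs."""
    costs = {n: fixed_rate_bits_per_pixel(residuals, n) for n in candidates}
    return min(costs, key=costs.get), costs
```

A small NBIT saves bits on every pixel but pays 16 bits for each residual that no longer fits, so the minimum falls where the two effects balance.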

Dependence of Results on Data: The results for the best spatial method show 
a spread in compression of 0.64 bit/pixel depending on the data set for variable-rate 
coding, and of 0.21 bit/pixel for fixed-rate coding. For the best spectral method, the 
corresponding figures are 0.09 and 0.09 bit/pixel. The results for the spectral method are 
thus more consistent across data sets. 

Noise Sensitivity: The variations in the data in Channels 1-4 and 
Channels 223-224 are dominated by the channel noise (the standard deviation of the data in 
each of these channels is very close to that channel's NER). If these channels are excluded 
from the compression evaluation, then the compressions are improved by about 0.2 
bit/pixel. There are no exceedances for NBIT=12 and NBIT=13 when these channels are 
disregarded. 

6. ENTROPY LIMITS DUE TO INSTRUMENT NOISE 

The noise in each channel causes a spread in values, and so contributes to the 
entropy of the data. The noise entropy of a single channel can be calculated by using its 
NER and assuming a probability distribution. The entropy caused by the noise alone has 
been modelled numerically, by constructing a univariate probability distribution for all 
224 channels. Using the NERs given in the Jasper Ridge and Moffett Field ancillary data 
sets, this noise entropy is found to be 5.28 bits/pixel for a Gaussian distribution of noise 
in each channel, and 5.03 bits/pixel for a Laplacian distribution. For the three data sets, 
the entropies of the residuals produced by the "Three-point channel" spectral method are 
the lowest. For a value of NBIT=8, the entropies are 5.45, 5.36 and 5.45 bits/pixel 
respectively. 
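
One way to read this noise model is sketched below in Python. Each channel's noise is discretised to integer levels with a standard deviation equal to its NER (expressed in digital numbers), the per-channel distributions are averaged into a single pooled distribution, and the entropy of that distribution is computed. The pooling step, the function names and any NER values supplied are assumptions made for illustration; the actual construction of the model may differ.

```python
import math
import numpy as np

def _bin_edges(half_width):
    # Edges of unit-wide bins centred on the integers -half_width..half_width.
    return np.arange(-half_width - 0.5, half_width + 1.5)

def gaussian_pmf(sigma, half_width):
    """Zero-mean Gaussian of standard deviation sigma, integrated over unit bins."""
    edges = _bin_edges(half_width)
    cdf = np.array([0.5 * (1.0 + math.erf(e / (sigma * math.sqrt(2.0)))) for e in edges])
    pmf = np.diff(cdf)
    return pmf / pmf.sum()

def laplacian_pmf(sigma, half_width):
    """Zero-mean Laplacian of standard deviation sigma, integrated over unit bins."""
    scale = sigma / math.sqrt(2.0)   # Laplacian scale giving standard deviation sigma
    edges = _bin_edges(half_width)
    cdf = 0.5 + 0.5 * np.sign(edges) * (1.0 - np.exp(-np.abs(edges) / scale))
    pmf = np.diff(cdf)
    return pmf / pmf.sum()

def pooled_noise_entropy(sigmas, pmf_fn, half_width=512):
    """Entropy (bits) of the average of the per-channel noise distributions."""
    pooled = np.mean([pmf_fn(s, half_width) for s in sigmas], axis=0)
    nonzero = pooled[pooled > 0]
    return float(-(nonzero * np.log2(nonzero)).sum())
```

For equal standard deviations the Laplacian gives a lower entropy than the Gaussian, which agrees in direction with the 5.03 and 5.28 bits/pixel quoted above.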

The similarity of the results for the three data sets, and the closeness of these 
results to the theoretical values, support the suggestion that lossless compression using 
spectral information is close to the limit set by the AVIRIS channel noise. If the probability 
distribution of the residuals is similar to that of noise, then a Huffman codebook for 
variable rate coding might be designed on the basis of instrument parameters, and not 
have to be derived during the compression process. 